Zurich - 27 & 28 June 2022
lavaan
Use the lavaan package with missing = "fiml".
library(lavaan)

model <- '
  # variances
  Ozone ~~ Ozone
  Solar.R ~~ Solar.R
  Wind ~~ Wind
  Temp ~~ Temp
  # covariances
  Ozone ~~ Solar.R + Wind + Temp
  Solar.R ~~ Wind + Temp
  Wind ~~ Temp
'
# full information maximum likelihood (FIML) uses all available cases
fit <- sem(model, data = airquality, missing = "fiml", meanstructure = TRUE)
>
> Parameter Estimates:
>
>   Standard errors                             Standard
>   Information                                 Observed
>   Observed information based on                Hessian
>
> Covariances:
>               Estimate   Std.Err   z-value   P(>|z|)    Std.lv   Std.all
>   Ozone ~~
>     Solar.R      942.530   266.602     3.535     0.000   942.530     0.324
>     Wind         -64.636    11.033    -5.858     0.000   -64.636    -0.570
>     Temp         209.563    31.267     6.702     0.000   209.563     0.687
>   Solar.R ~~
>     Wind         -17.335    26.211    -0.661     0.508   -17.335    -0.055
>     Temp         238.073    74.272     3.205     0.001   238.073     0.281
>   Wind ~~
>     Temp         -15.172     2.946    -5.151     0.000   -15.172    -0.458
>
> Intercepts:
>               Estimate   Std.Err   z-value   P(>|z|)    Std.lv   Std.all
>     Ozone         41.871     2.782    15.048     0.000    41.871     1.296
>     Solar.R      184.847     7.428    24.884     0.000   184.847     2.055
>     Wind           9.958     0.284    35.076     0.000     9.958     2.836
>     Temp          77.882     0.763   102.112     0.000    77.882     8.255
>
> Variances:
>               Estimate   Std.Err   z-value   P(>|z|)    Std.lv   Std.all
>     Ozone       1044.019   129.627     8.054     0.000  1044.019     1.000
>     Solar.R     8090.702   950.667     8.511     0.000  8090.702     1.000
>     Wind          12.330     1.410     8.746     0.000    12.330     1.000
>     Temp          89.006    10.176     8.746     0.000    89.006     1.000
Output is quite large and gives a lot of information at once.
standardized = TRUE)
Use fmi = TRUE in the summary() function to get the fraction of missing information (FMI).
FMI = the proportion of a parameter's sampling variance that is attributable to missing data, i.e. the loss of precision due to missing data and thus its impact on the estimates.
>      Ozone Solr.R Wind Temp
> [1,]     1      1    1    1
> [2,]     0      1    1    1
> [3,]     1      0    1    1
> [4,]     0      0    1    1
>         Ozone Solr.R  Wind  Temp
> Ozone   0.758
> Solar.R 0.725  0.954
> Wind    0.758  0.954 1.000
> Temp    0.758  0.954 1.000 1.000
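The two tables above (missing data patterns and covariance coverage) can be cross-checked without lavaan. The following is a minimal base R sketch; the object names `vars`, `observed`, `patterns` and `coverage` are illustrative and not part of the original slides.

```r
# Base R sketch: missing data patterns and covariance coverage for airquality.
# All object names here are illustrative assumptions.
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
observed <- !is.na(airquality[, vars])   # logical matrix, TRUE = observed

# Unique response patterns (1 = observed, 0 = missing)
patterns <- unique(observed * 1L)

# Covariance coverage: proportion of cases in which both variables are observed
coverage <- crossprod(observed * 1) / nrow(observed)
print(patterns)
print(round(coverage, 3))
```

The diagonal of `coverage` is the proportion of observed values per variable; the off-diagonal entries show how much data supports each covariance.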
>
> Parameter Estimates:
>
>   Standard errors                             Standard
>   Information                                 Observed
>   Observed information based on                Hessian
>
> Regressions:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>   Ozone ~
>     Solar.R        0.127     0.032     3.915     0.000     0.223
>
> Intercepts:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>    .Ozone         18.599     6.687     2.781     0.005     0.218
>
> Variances:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>    .Ozone        964.164   129.421     7.450     0.000     0.240
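The call that produced the output above is not shown. As a hedged sketch, FMI values of this kind could be requested as follows, assuming the model is a regression of Ozone on Solar.R (inferred from the output, not confirmed by the source) and that the lavaan package is installed. The names `fit_reg` and `fmi_tab` are illustrative.

```r
# Sketch only: the model syntax is an assumption inferred from the output.
fmi_tab <- NULL
if (requireNamespace("lavaan", quietly = TRUE)) {
  fit_reg <- lavaan::sem("Ozone ~ Solar.R", data = airquality,
                         missing = "fiml")
  # Parameter table with an additional fmi column
  fmi_tab <- lavaan::parameterEstimates(fit_reg, fmi = TRUE)
  print(fmi_tab[, c("lhs", "op", "rhs", "est", "se", "fmi")])
  # The same information in summary form:
  # lavaan::summary(fit_reg, fmi = TRUE)
}
```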
>          term estimate  std.error statistic       df     p.value
> 1 (Intercept) 19.53398 7.24259142  2.697098 20.14387 0.013810123
> 2     Solar.R  0.11901 0.03438927  3.460673 22.04619 0.002219167
>
> Parameter Estimates:
>
>   Standard errors                             Standard
>   Information                                 Observed
>   Observed information based on                Hessian
>
> Regressions:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>   Ozone ~
>     Solar.R        0.112     0.030     3.701     0.000     0.158
>
> Covariances:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>   Solar.R ~~
>     Wind         -17.069    26.123    -0.653     0.514     0.046
>     Temp         232.954    73.899     3.152     0.002     0.077
>     Month         -7.321    10.557    -0.693     0.488     0.056
>     Day         -119.301    66.532    -1.793     0.073     0.051
>  .Ozone ~~
>     Wind         -63.332    10.615    -5.966     0.000     0.095
>     Temp         183.490    28.455     6.448     0.000     0.102
>     Month          8.240     3.735     2.206     0.027     0.090
>     Day            6.540    23.317     0.280     0.779     0.134
>   Wind ~~
>     Temp         -15.172     2.946    -5.151     0.000    -0.000
>     Month         -0.884     0.407    -2.171     0.030    -0.000
>     Day            0.843     2.509     0.336     0.737    -0.000
>   Temp ~~
>     Month          5.607     1.168     4.799     0.000    -0.000
>     Day          -10.886     6.796    -1.602     0.109     0.000
>   Month ~~
>     Day           -0.099     1.009    -0.098     0.922    -0.000
>
> Intercepts:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>    .Ozone         21.819     6.194     3.523     0.000     0.152
>     Solar.R      185.534     7.410    25.038     0.000     0.042
>     Wind           9.958     0.284    35.076     0.000     0.000
>     Temp          77.882     0.763   102.112     0.000     0.000
>     Month          6.993     0.114    61.269     0.000     0.000
>     Day           15.804     0.714    22.125     0.000     0.000
>
> Variances:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>    .Ozone        943.445   119.142     7.919     0.000     0.180
>     Solar.R     8050.793   941.698     8.549     0.000     0.045
>     Wind          12.330     1.410     8.746     0.000    -0.000
>     Temp          89.006    10.176     8.746     0.000    -0.000
>     Month          1.993     0.228     8.746     0.000     0.000
>     Day           78.066     8.925     8.746     0.000     0.000
semTools package
>
> Parameter Estimates:
>
>   Standard errors                             Standard
>   Information                                 Observed
>   Observed information based on                Hessian
>
> Regressions:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>   Ozone ~
>     Solar.R        0.112     0.030     3.701     0.000     0.158
>
> Covariances:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>   Wind ~~
>     Temp         -15.172     2.946    -5.151     0.000     0.000
>     Month         -0.884     0.407    -2.171     0.030    -0.000
>     Day            0.843     2.509     0.336     0.737     0.000
>   Temp ~~
>     Month          5.607     1.168     4.799     0.000     0.000
>     Day          -10.886     6.796    -1.602     0.109     0.000
>   Month ~~
>     Day           -0.099     1.009    -0.098     0.922    -0.000
>   Wind ~~
>    .Ozone        -63.332    10.615    -5.966     0.000     0.095
>   Temp ~~
>    .Ozone        183.490    28.455     6.448     0.000     0.102
>   Month ~~
>    .Ozone          8.240     3.735     2.206     0.027     0.090
>   Day ~~
>    .Ozone          6.540    23.317     0.280     0.779     0.134
>   Wind ~~
>     Solar.R      -17.069    26.123    -0.653     0.514     0.046
>   Temp ~~
>     Solar.R      232.954    73.899     3.152     0.002     0.077
>   Month ~~
>     Solar.R       -7.321    10.557    -0.693     0.488     0.056
>   Day ~~
>     Solar.R     -119.301    66.532    -1.793     0.073     0.051
>
> Intercepts:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>    .Ozone         21.819     6.194     3.523     0.000     0.152
>     Solar.R      185.534     7.410    25.038     0.000     0.042
>     Wind           9.958     0.284    35.076     0.000     0.000
>     Temp          77.882     0.763   102.112     0.000     0.000
>     Month          6.993     0.114    61.269     0.000     0.000
>     Day           15.804     0.714    22.125     0.000     0.000
>
> Variances:
>               Estimate   Std.Err   z-value   P(>|z|)       FMI
>    .Ozone        943.445   119.142     7.919     0.000     0.180
>     Solar.R     8050.793   941.698     8.549     0.000     0.045
>     Wind          12.330     1.410     8.746     0.000    -0.000
>     Temp          89.006    10.176     8.746     0.000     0.000
>     Month          1.993     0.228     8.746     0.000    -0.000
>     Day           78.066     8.925     8.746     0.000     0.000
>          term   estimate  std.error statistic        df      p.value
> 1 (Intercept) 20.4707060 6.42380402  3.186695  57.45493 0.0023279422
> 2     Solar.R  0.1135789 0.02925834  3.881931 109.81512 0.0001772778
>   item1 item2 item3 item4 item5 Total.score
> 1     1     0     0     0     1           2
> 2    NA     1     0    NA     1          NA
> 3    NA    NA    NA    NA    NA          NA
Both partially missing item scores (row 2) and completely missing item scores (row 3) lead to a missing total score.
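A small base R sketch of why this happens, using the three respondents from the table above. The column `n_missing` is an illustrative addition, not part of the original slides.

```r
# Item responses for the three respondents shown in the table above
items <- data.frame(
  item1 = c(1, NA, NA), item2 = c(0, 1, NA), item3 = c(0, 0, NA),
  item4 = c(0, NA, NA), item5 = c(1, 1, NA)
)
item_names <- paste0("item", 1:5)

# rowSums() returns NA as soon as one item is missing (na.rm = FALSE by default)
items$Total.score <- rowSums(items[item_names])

# Illustrative helper: number of missing items per respondent, which
# separates partial non-response (row 2) from full non-response (row 3)
items$n_missing <- rowSums(is.na(items[item_names]))
print(items)
```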
>   Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 TSQ1
> 1   NA   NA   NA    2    1   NA
> 2    2    5    1    1    1   10
> 3    1    1    1    1    1    5
> 4    1    1    2    4    1    9
> 5    5    5    1    5    5   21
library(dplyr)

x %>%
  # compute the average over the available items (AAI)
  mutate(AAI = rowMeans(select(., Q1i1, Q1i2, Q1i3, Q1i4, Q1i5), na.rm = TRUE)) %>%
  # replace every missing item score with the respondent's AAI
  mutate(across(Q1i1:Q1i5, ~ ifelse(is.na(.x), AAI, .x))) %>%
  # recompute the total score from the completed item scores
  mutate(TSQ1 = rowSums(select(., Q1i1, Q1i2, Q1i3, Q1i4, Q1i5)))
>   Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 TSQ1 AAI
> 1  1.5  1.5  1.5    2    1  7.5 1.5
> 2  2.0  5.0  1.0    1    1 10.0 2.0
> 3  1.0  1.0  1.0    1    1  5.0 1.0
> 4  1.0  1.0  2.0    4    1  9.0 1.8
> 5  5.0  5.0  1.0    5    5 21.0 4.2
The imputation model can grow large when all items are used.
When the total score is used in the analyses, the total score should also be used as a predictor for the other variables.
For Q1:
>      Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 Q2i1 Q2i2 Q2i3 Q2i4 Q2i5 TSQ1 TSQ2 cov1
> Q1i1    0    1    1    1    1    0    0    0    0    0    1    1    1
> Q1i2    1    0    1    1    1    0    0    0    0    0    1    1    1
> Q1i3    1    1    0    1    1    0    0    0    0    0    1    1    1
> Q1i4    1    1    1    0    1    0    0    0    0    0    1    1    1
> Q1i5    1    1    1    1    0    0    0    0    0    0    1    1    1
For Q2:
>      Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 Q2i1 Q2i2 Q2i3 Q2i4 Q2i5 TSQ1 TSQ2 cov1
> Q2i1    0    0    0    0    0    0    1    1    1    1    1    1    1
> Q2i2    0    0    0    0    0    1    0    1    1    1    1    1    1
> Q2i3    0    0    0    0    0    1    1    0    1    1    1    1    1
> Q2i4    0    0    0    0    0    1    1    1    0    1    1    1    1
> Q2i5    0    0    0    0    0    1    1    1    1    0    1    1    1
>      Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 Q2i1 Q2i2 Q2i3 Q2i4 Q2i5 TSQ1 TSQ2 cov1
> Q1i1    0    1    1    1    1    0    0    0    0    0    0    1    1
> Q1i2    1    0    1    1    1    0    0    0    0    0    0    1    1
> Q1i3    1    1    0    1    1    0    0    0    0    0    0    1    1
> Q1i4    1    1    1    0    1    0    0    0    0    0    0    1    1
> Q1i5    1    1    1    1    0    0    0    0    0    0    0    1    1
> Q2i1    0    0    0    0    0    0    1    1    1    1    1    0    1
> Q2i2    0    0    0    0    0    1    0    1    1    1    1    0    1
> Q2i3    0    0    0    0    0    1    1    0    1    1    1    0    1
> Q2i4    0    0    0    0    0    1    1    1    0    1    1    0    1
> Q2i5    0    0    0    0    0    1    1    1    1    0    1    0    1
>      Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 Q2i1 Q2i2 Q2i3 Q2i4 Q2i5 TSQ1 TSQ2 cov1
> TSQ1    0    0    0    0    0    0    0    0    0    0    0    1    1
> TSQ2    0    0    0    0    0    0    0    0    0    0    1    0    1
> cov1    0    0    0    0    0    0    0    0    0    0    1    1    0
>      TSQ1 TSQ2 cov1
> TSQ1    0    1    1
> TSQ2    1    0    1
> cov1    1    1    0
>      Q1i1 Q1i2 Q1i3 Q1i4 Q1i5 Q2i1 Q2i2 Q2i3 Q2i4 Q2i5 TSQ1 TSQ2 cov1
> Q1i1    0    1    1    1    1    0    0    0    0    0    0    1    1
> Q1i2    1    0    1    1    1    0    0    0    0    0    0    1    1
> Q1i3    1    1    0    1    1    0    0    0    0    0    0    1    1
> Q1i4    1    1    1    0    1    0    0    0    0    0    0    1    1
> Q1i5    1    1    1    1    0    0    0    0    0    0    0    1    1
> Q2i1    0    0    0    0    0    0    1    1    1    1    1    0    1
> Q2i2    0    0    0    0    0    1    0    1    1    1    1    0    1
> Q2i3    0    0    0    0    0    1    1    0    1    1    1    0    1
> Q2i4    0    0    0    0    0    1    1    1    0    1    1    0    1
> Q2i5    0    0    0    0    0    1    1    1    1    0    1    0    1
> TSQ1    0    0    0    0    0    0    0    0    0    0    0    1    1
> TSQ2    0    0    0    0    0    0    0    0    0    0    1    0    1
> cov1    0    0    0    0    0    0    0    0    0    0    1    1    0
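The full predictor matrix above follows a simple rule: items are predicted by the other items of their own questionnaire, the total score of the other questionnaire and the covariate, while total scores and the covariate predict each other. A minimal base R sketch of that rule is given below; the object names `q1`, `q2`, `scores` and `pred` are illustrative assumptions.

```r
# Sketch: build the 13 x 13 predictor matrix shown above (rows = variable
# to impute, columns = predictors). Object names are illustrative.
q1 <- paste0("Q1i", 1:5)
q2 <- paste0("Q2i", 1:5)
scores <- c("TSQ1", "TSQ2", "cov1")
vars <- c(q1, q2, scores)

pred <- matrix(0, length(vars), length(vars), dimnames = list(vars, vars))
pred[q1, c(q1, "TSQ2", "cov1")] <- 1   # Q1 items: own items, other total score, covariate
pred[q2, c(q2, "TSQ1", "cov1")] <- 1   # Q2 items: own items, other total score, covariate
pred[scores, scores] <- 1              # total scores and covariate predict each other
diag(pred) <- 0                        # a variable never predicts itself
print(pred)
```

A matrix of this form is what the `predictorMatrix` argument of `mice()` expects.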
During each iteration for Q1:
1. Impute item scores using items from its own questionnaire, total score(s) from other questionnaires and covariate(s).
2. Total score is re-calculated using the imputed item scores.
3. Updated total score is used as predictor for covariate(s) and items of other questionnaires in next iteration.
Note: the subscript \(_i\) indicates the imputed value from the previous iteration.
>  TSQ1  TSQ2
> "pmm" "pmm"
>                                   TSQ1                                   TSQ2
> "~I(Q1i1 + Q1i2 + Q1i3 + Q1i4 + Q1i5)" "~I(Q2i1 + Q2i2 + Q2i3 + Q2i4 + Q2i5)"
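The change in the method vector shown above (from "pmm" to a formula for the total scores) is what makes the total score be re-calculated from the imputed items in each iteration. A hedged sketch of how such a method vector could be assembled in base R follows. Setting "pmm" for every other variable is an assumption for illustration; in practice the default methods produced by mice would be the starting point.

```r
# Sketch: method vector with passive imputation for the total scores.
# Using "pmm" for all other variables is an illustrative assumption.
q1 <- paste0("Q1i", 1:5)
q2 <- paste0("Q2i", 1:5)
vars <- c(q1, q2, "TSQ1", "TSQ2", "cov1")

meth <- setNames(rep("pmm", length(vars)), vars)
# Passive imputation: the total score is the sum of the (imputed) items
meth["TSQ1"] <- paste0("~I(", paste(q1, collapse = " + "), ")")
meth["TSQ2"] <- paste0("~I(", paste(q2, collapse = " + "), ")")
print(meth[c("TSQ1", "TSQ2")])
```

The resulting vector would be passed to `mice()` through its `method` argument, together with the predictor matrix.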
R: the lavaan package - simulates Mplus
>    id group          T0          T1        T2
> 1   1     0 -0.89273971  0.62595050 1.6787242
> 2   2     1  1.09466683  1.66393217 2.6314272
> 3   3     0  0.53677979  1.99526687 3.0587021
> 4   4     1 -0.08057365  1.35645288 1.5600102
> 5   5     0 -0.24587013 -0.06647697 0.5832191
> 6   6     1  0.60444222  1.45718160 2.7469386
> 7   7     0  0.00417498  0.21136080 1.4365193
> 8   8     1  1.94695848  2.76345288 3.6306766
> 9   9     0 -2.22879834 -0.75850320 0.8149130
> 10 10     1 -0.23124489  1.28434128 2.1040148
> 11 11     0  1.53801265  2.04635754 2.6932853
> 12 12     1  0.65401059  1.46172168 2.0400761
> 13 13     0 -0.29052400 -0.94582269 1.3305414
> 14 14     1  0.30578755  1.01441996 2.6523174
> 15 15     0 -0.70110697  0.67496886 1.4098061
>    id group time     outcome
> 1   1     0   T0 -0.89273971
> 2   1     0   T1  0.62595050
> 3   1     0   T2  1.67872419
> 4   2     1   T0  1.09466683
> 5   2     1   T1  1.66393217
> 6   2     1   T2  2.63142721
> 7   3     0   T0  0.53677979
> 8   3     0   T1  1.99526687
> 9   3     0   T2  3.05870208
> 10  4     1   T0 -0.08057365
> 11  4     1   T1  1.35645288
> 12  4     1   T2  1.56001019
> 13  5     0   T0 -0.24587013
> 14  5     0   T1 -0.06647697
> 15  5     0   T2  0.58321912
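The two tables above show the same data in wide format (one row per person) and long format (one row per person per time-point). A minimal sketch of the conversion with base R's reshape() follows, using the first three persons from the wide table. The slides do not state which function was used, so this is one possible approach.

```r
# First three persons from the wide table above
wide <- data.frame(
  id    = 1:3,
  group = c(0, 1, 0),
  T0    = c(-0.89273971, 1.09466683, 0.53677979),
  T1    = c(0.62595050, 1.66393217, 1.99526687),
  T2    = c(1.6787242, 2.6314272, 3.0587021)
)

# Stack T0, T1 and T2 into a single outcome column with a time indicator
long <- reshape(wide, direction = "long",
                varying = c("T0", "T1", "T2"), v.names = "outcome",
                timevar = "time", times = c("T0", "T1", "T2"),
                idvar = "id")

# Sort by person and time-point and fix the column order
long <- long[order(long$id, long$time), c("id", "group", "time", "outcome")]
rownames(long) <- NULL
print(long)
```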
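Data set for two groups with measurements at three time-points (T0, T1, T2) for the outcome and three covariates (cov1, cov2, cov3).
In total: 13 variables in the wide imputation model (4 variables at 3 time-points, plus group). The model can grow large when more variables are measured at each time-point.
Example predictor matrix for cov1: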
>         outcome_T0 outcome_T1 outcome_T2 cov1_T0 cov1_T1 cov1_T2 cov2_T0
> cov1_T0          1          1          1       0       1       1       1
> cov1_T1          1          1          1       1       0       1       0
> cov1_T2          1          1          1       1       1       0       0
>         cov2_T1 cov2_T2 cov3_T0 cov3_T1 cov3_T2
> cov1_T0       0       0       1       0       0
> cov1_T1       1       0       0       1       0
> cov1_T2       0       1       0       0       1
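The rows above follow a pattern: cov1 at a given time-point is predicted by the outcome at all time-points, by cov1 at the other time-points, and by the other covariates at the same time-point only. The base R sketch below reproduces these three rows. The rule is read off the matrix, and the object names are illustrative.

```r
# Sketch: predictor rows for cov1 in the wide imputation model.
# The selection rule is inferred from the matrix above.
times  <- c("T0", "T1", "T2")
series <- c("outcome", "cov1", "cov2", "cov3")
vars   <- paste(rep(series, each = length(times)), times, sep = "_")

pred_cov1 <- matrix(0, length(times), length(vars),
                    dimnames = list(paste0("cov1_", times), vars))
for (tp in times) {
  target <- paste0("cov1_", tp)
  pred_cov1[target, paste0("outcome_", times)] <- 1            # outcome, all time-points
  pred_cov1[target, paste0("cov1_", setdiff(times, tp))] <- 1  # cov1, other time-points
  pred_cov1[target, paste0(c("cov2_", "cov3_"), tp)] <- 1      # other covariates, same time-point
}
print(pred_cov1)
```

Restricting the other covariates to the same time-point keeps the number of predictors per variable at 7 instead of 11.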
The miceadds package contains many additional imputation methods.
The broom.mixed package is needed to enable the pool() function for the mice output.
Use -2 for the cluster variable id in the predictor matrix (random intercept).
>         id group time outcome cov1 cov2 cov3
> id       0     1    1       1    1    1    1
> group   -2     0    1       1    1    1    1
> time    -2     1    0       1    1    1    1
> outcome -2     1    1       0    1    1    1
> cov1    -2     1    1       1    0    1    1
> cov2    -2     1    1       1    1    0    1
> cov3    -2     1    1       1    1    1    0
2l.pmm
>       id    group     time  outcome     cov1     cov2     cov3
>       ""       ""       "" "2l.pmm" "2l.pmm"       ""       ""
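The predictor matrix and method vector above can be set up by hand. The base R sketch below reproduces both objects; the names `pred` and `meth` are illustrative, and the objects would be passed to `mice()` via its `predictorMatrix` and `method` arguments with the miceadds package loaded.

```r
# Sketch: multilevel imputation set-up for the long format data.
# Object names are illustrative.
vars <- c("id", "group", "time", "outcome", "cov1", "cov2", "cov3")

pred <- matrix(1, length(vars), length(vars), dimnames = list(vars, vars))
diag(pred) <- 0                  # a variable never predicts itself
pred[vars != "id", "id"] <- -2   # -2 marks id as the cluster variable

# Only the incomplete variables get an imputation method
meth <- setNames(rep("", length(vars)), vars)
meth[c("outcome", "cov1")] <- "2l.pmm"
print(pred)
print(meth)
```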
Missing covariate data are handled by listwise deletion
>     id group time cov2 cov3 cov1 outcome
> 111  1     1    1    1    1    1       1  0
> 20   1     1    1    1    1    1       0  1
> 18   1     1    1    1    1    0       1  1
> 1    1     1    1    1    1    0       0  2
>      0     0    0    0    0   19      21 40
> # A tibble: 4 x 8
>   effect   group    term         estimate std.error statistic conf.low conf.high
>   <chr>    <chr>    <chr>           <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
> 1 fixed    <NA>     (Intercept)    1.06       0.130     8.10     0.801     1.31
> 2 fixed    <NA>     cov2           0.0190     0.115     0.166   -0.206     0.244
> 3 ran_pars id       sd__(Interc~   0.589     NA        NA       NA        NA
> 4 ran_pars Residual sd__Observa~   1.11      NA        NA       NA        NA
>          term   estimate std.error statistic        df      p.value
> 1 (Intercept) 1.06633608 0.1325362 8.0456241 111.34485 1.018519e-12
> 2        cov2 0.01953625 0.1163277 0.1679415  86.62942 8.670208e-01